DEFOREST’S FORMULA FOR “AN UNSYMMETRICAL PROBABILITY CURVE”


In presenting a long-forgotten investigation by E. L. DeForest
(’82-’83) on “an unsymmetrical probability curve,” the writer
wishes to call attention to the fact that the first systematic analysis
of the subject was attempted by DeForest and as a result he
obtained a formula which is identical with that for Professor
Pearson’s (’95) generalized probability curve. DeForest suggests
further that by retaining the higher derivatives a more general
formula, of which the formula already found will be a particular
case, may be obtained from his original differential equation.
Thus DeForest’s investigation is not only interesting from an
historical standpoint, but still more from the fact that the same
formula, though in different terms, has been derived from entirely
different methods of analysis by Professor Pearson. This fact
furnishes good evidence as to the validity of Professor Pearson’s
theoretical assumption.

As  the  investigation  was  published  a number  of  years  ago,  the 
original  paper  by  DeForest  is  difficult  to  obtain,  and  so,  for  the 
reader  who  is  anxious  to  see  the  method  of  mathematical  analysis 
adopted  by  him,  I venture  to  present  in  the  following  pages  some 
of  the  important  points  which  directly  concern  the  derivation  of 
his  final  formula.  I shall  also  add  a mathematical  process  of 
transformation  of  Professor  Pearson’s  formula  to  that  of  DeForest. 
For  numerous  other  important  and  interesting  points,  the  reader 
must  refer  to  the  original  memoirs. 

DeForest  employed  this  reasoning: 

Let  the  following  be  a given  polynomial 

$$\lambda_{-m} Z^{-m} + \dots + \lambda_{-1} Z^{-1} + \lambda_0 + \lambda_1 Z^{1} + \dots + \lambda_m Z^{m}. \tag{1}$$

Its expansion to the $\kappa$ power may be written

$$l_{-\kappa m} Z^{-\kappa m} + \dots + l_{-1} Z^{-1} + l_0 + l_1 Z^{1} + \dots + l_{\kappa m} Z^{\kappa m}. \tag{2}$$

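From the relation

$$(\lambda_{-m} Z^{-m} + \dots + \lambda_m Z^{m})^{\kappa} = l_{-\kappa m} Z^{-\kappa m} + \dots + l_{\kappa m} Z^{\kappa m}$$

we have

$$\kappa \log(\lambda_{-m} Z^{-m} + \dots + \lambda_m Z^{m}) = \log(l_{-\kappa m} Z^{-\kappa m} + \dots + l_{\kappa m} Z^{\kappa m}),$$

which holds good for all values of $Z$. By differentiation with
respect to $Z$ and then clearing of fractions it becomes

$$\kappa\,(-m\lambda_{-m} Z^{-m-1} + \dots + m\lambda_m Z^{m-1})\,(l_{-\kappa m} Z^{-\kappa m} + \dots + l_{\kappa m} Z^{\kappa m})
= (\lambda_{-m} Z^{-m} + \dots + \lambda_m Z^{m})\,(-\kappa m\, l_{-\kappa m} Z^{-\kappa m-1} + \dots + \kappa m\, l_{\kappa m} Z^{\kappa m-1}). \tag{3}$$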

Forming the coefficient of $Z^{i-1}$ in each polynomial product, and
remembering also that the rank of the middle $l$ of this group
reckoned from $l_0$ is $i$, we get, by equating the two to each other
by the principle of undetermined coefficients,

$$\kappa\,(-m\lambda_{-m} l_{i+m} + \dots + m\lambda_m l_{i-m}) = (i+m)\lambda_{-m} l_{i+m} + \dots + (i-m)\lambda_m l_{i-m}.$$

In the second member, let that part which does not have the
coefficient $i$ be transferred to the first member, then

$$-m\lambda_{-m} l_{i+m} + \dots + m\lambda_m l_{i-m} = \frac{i}{\kappa+1}\,(\lambda_{-m} l_{i+m} + \dots + \lambda_m l_{i-m}). \tag{4}$$
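Relation (4) can be checked exactly on a small case. The following is a minimal sketch in Python, not part of DeForest's paper: the helper names `poly_power` and `relation_4_holds` and the test coefficients are illustrative assumptions. The list `lam` holds $\lambda_{-m}, \dots, \lambda_m$ in order, and exact rational arithmetic is used so that the two members can be compared without rounding.

```python
from fractions import Fraction

def poly_power(lam, kappa):
    """Coefficients of (sum_j lam[j + m] * Z**j)**kappa, lowest power first."""
    out = [Fraction(1)]
    for _ in range(kappa):
        new = [Fraction(0)] * (len(out) + len(lam) - 1)
        for r, c in enumerate(out):
            for s, v in enumerate(lam):
                new[r + s] += c * v
        out = new
    return out

def relation_4_holds(lam, kappa):
    """True if relation (4) holds for every rank i of the expansion."""
    m = (len(lam) - 1) // 2
    km = kappa * m
    l = poly_power(lam, kappa)

    def coeff(i):
        # l_i, taken as zero outside the range -kappa*m .. kappa*m
        return l[i + km] if -km <= i <= km else Fraction(0)

    for i in range(-km - m, km + m + 1):
        # first member: sum of j * lambda_j * l_(i-j)
        first = sum(j * lam[j + m] * coeff(i - j) for j in range(-m, m + 1))
        # second member without its factor i/(kappa+1)
        second = sum(lam[j + m] * coeff(i - j) for j in range(-m, m + 1))
        if first != Fraction(i, kappa + 1) * second:
            return False
    return True

print(relation_4_holds([1, 2, 3, 1, 3], 4))
```

The polynomial need not be normalized for the check, since both members of (4) are homogeneous in the $\lambda$'s.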

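Clearly then any coefficient $l_i$ in the expansion, and the $2m$
coefficients nearest to it, will be connected by the relation

$$\frac{(\lambda_1 l_{i-1} - \lambda_{-1} l_{i+1}) + 2(\lambda_2 l_{i-2} - \lambda_{-2} l_{i+2}) + \dots + m(\lambda_m l_{i-m} - \lambda_{-m} l_{i+m})}{\lambda_0 l_i + (\lambda_1 l_{i-1} + \lambda_{-1} l_{i+1}) + (\lambda_2 l_{i-2} + \lambda_{-2} l_{i+2}) + \dots + (\lambda_m l_{i-m} + \lambda_{-m} l_{i+m})} = \frac{i}{\kappa+1}. \tag{5}$$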


This is the fundamental principle of DeForest’s analysis in his
numerous interesting studies on the theory of probability. Let
$l_{i+1}$, etc., in (5) be expressed in terms of $l_i$ and its
differences. For this DeForest refers to a convenient formula
given by Lacroix (Cal. diff. et integ., Paris, 1819) as follows:

$$l_{i+n} = l_i + \frac{n}{1}\Delta_1 + \frac{n^2}{2}\Delta_2 + \frac{n(n^2-1^2)}{3!}\Delta_3 + \frac{n^2(n^2-1^2)}{4!}\Delta_4 + \frac{n(n^2-1^2)(n^2-2^2)}{5!}\Delta_5 + \text{etc.} \tag{6}$$


For  brevity  let  us  write  also 

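$$\begin{aligned}
b_0 &= \lambda_0 + (\lambda_1 + \lambda_{-1}) + (\lambda_2 + \lambda_{-2}) + \dots + (\lambda_m + \lambda_{-m}) \\
b_1 &= 1\,(\lambda_1 - \lambda_{-1}) + 2\,(\lambda_2 - \lambda_{-2}) + \dots + m\,(\lambda_m - \lambda_{-m}) \\
b_2 &= 1^2(\lambda_1 + \lambda_{-1}) + 2^2(\lambda_2 + \lambda_{-2}) + \dots + m^2(\lambda_m + \lambda_{-m}) \\
b_3 &= 1^3(\lambda_1 - \lambda_{-1}) + 2^3(\lambda_2 - \lambda_{-2}) + \dots + m^3(\lambda_m - \lambda_{-m})
\end{aligned} \tag{7}$$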

etc.,  etc. 
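The abbreviations in (7) amount to $b_n = \sum_j j^n \lambda_j$, the sum running from $j = -m$ to $j = m$. A minimal Python sketch of this reading follows; the function name `moments` and the sample coefficients are illustrative assumptions, not taken from the original.

```python
def moments(lam, n_max):
    """Return [b_0, ..., b_n_max] of (7) for coefficients lam[j + m] = lambda_j."""
    m = (len(lam) - 1) // 2
    return [sum((j ** n) * lam[j + m] for j in range(-m, m + 1))
            for n in range(n_max + 1)]

# lambda_{-2}, lambda_{-1}, lambda_0, lambda_1, lambda_2 (arbitrary sample values)
lam = [1, 2, 3, 1, 3]
print(moments(lam, 3))  # b_0, b_1, b_2, b_3
```

For the odd orders the terms pair off as $j^n(\lambda_j - \lambda_{-j})$ and for the even orders as $j^n(\lambda_j + \lambda_{-j})$, which is the grouping printed in (7).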

Denoting  the  numerator  and  denominator  in  the  first  member  of 
(5)  by  N and  D respectively,  we  get 


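$$\begin{aligned}
N = {} & b_1 l_i - b_2\Delta_1 + \frac{1}{2} b_3 \Delta_2 - \frac{1}{3!}(b_4 - b_2)\Delta_3 + \frac{1}{4!}(b_5 - b_3)\Delta_4 \\
& - \frac{1}{5!}(b_6 - 5b_4 + 4b_2)\Delta_5 + \frac{1}{6!}(b_7 - 5b_5 + 4b_3)\Delta_6 \\
& - \frac{1}{7!}(b_8 - 14b_6 + 49b_4 - 36b_2)\Delta_7 + \dots \\[4pt]
D = {} & b_0 l_i - b_1\Delta_1 + \frac{1}{2} b_2 \Delta_2 - \frac{1}{3!}(b_3 - b_1)\Delta_3 + \frac{1}{4!}(b_4 - b_2)\Delta_4 \\
& - \frac{1}{5!}(b_5 - 5b_3 + 4b_1)\Delta_5 + \frac{1}{6!}(b_6 - 5b_4 + 4b_2)\Delta_6 \\
& - \frac{1}{7!}(b_7 - 14b_5 + 49b_3 - 36b_1)\Delta_7 + \dots
\end{aligned} \tag{8}$$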


When $\kappa$ becomes infinite, and the successive values of $l$ are
regarded as consecutive ordinates to a limiting curve, we have

$$l_i = y, \qquad \Delta_1 = dy, \qquad \Delta_2 = d^2y, \qquad \Delta_3 = d^3y, \qquad \text{etc.},$$

and at the same time when the ordinates are set close together,
the abscissa $x$ corresponding to any $y$ is $x = i\,dx$. Thus (8)
becomes the differential equation of the curve, and $b_0$, $b_1$, $b_2$, etc.,
are constants, and in fact are the successive moments of the area
bounded by the curve and the axis of abscissas, these moments
being taken about a vertical axis. Since any given polynomial
may be reduced to one in which $\Sigma(\lambda) = 1$, by dividing it
throughout by the sum of its coefficients, we therefore consider $b_0 = 1$.
If a constant number is added to or subtracted from all the
exponents of $Z$ in (1), it will not alter the value of $l$ in (2). Hence
by making $Z^0$ the abscissa of the center of gravity, $b_1$ becomes
zero. Then any constant $b_n$ in (7) will denote the sum of the
products formed by multiplying each $\lambda$ into the $n$th power of its
abscissa reckoned from the new origin, if the common interval
$\Delta x$ between the abscissas is regarded as unity. With the above
transformations, we may now write (8) in the following form:


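$$\frac{b_2\,dy - \frac{1}{2} b_3\,d^2y + \frac{1}{6}(b_4 - b_2)\,d^3y - \text{etc.}}{y + \frac{1}{2} b_2\,d^2y - \frac{1}{6} b_3\,d^3y + \text{etc.}} = \frac{-x}{(\kappa+1)\,dx}. \tag{9}$$

In the denominator of the first member let $d^2y$, $d^3y$, etc., be
neglected in comparison with $y$, and in the numerator let $d^3y$, $d^4y$,
etc., be neglected in comparison with $dy$. Since $\kappa$ is infinitely
large, we may write $\kappa$ instead of $\kappa + 1$.

Therefore

$$\frac{dy - \frac{b_3}{2b_2}\,d^2y}{y} = \frac{-x}{\kappa b_2\,dx}.$$

Invert both members of this equation, subtract $\frac{1}{2}(b_3 \div b_2)$ from
each and invert them both back again. This gives

$$\frac{dy - \frac{b_3}{2b_2}\,d^2y}{y - \frac{b_3}{2b_2}\,dy + \frac{b_3^2}{4b_2^2}\,d^2y} = \frac{-x}{\kappa b_2\,dx + \frac{b_3}{2b_2}\,x}. \tag{10}$$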


Thus far we have carried on our treatment on the assumption
that the origin $Z^0$ in the expansion is located at the center of
gravity of the coefficients $l$ in (2), which became the ordinates $y$
to the limiting curve. Now in (10) let the origin be transferred
from the center of gravity to another convenient point by putting

$$x - \frac{2\kappa b_2^2\,dx}{b_3} \tag{11}$$

in place of $x$. This gives, on neglecting the term in $d^2y$ in the
denominator of the first member,

$$\frac{dy - \frac{b_3}{2b_2}\,d^2y}{y - \frac{b_3}{2b_2}\,dy} = \frac{4\kappa b_2\,dx - 2\,(b_3 \div b_2)\,x}{(b_3 \div b_2)^2\,x}.$$

In the first member, the numerator is the differential of the
denominator. Without any further change of origin, we can write
approximately

$$y + \tfrac{1}{2}(b_3 \div b_2)\,dy \ \text{ in place of } \ y, \qquad x + \tfrac{1}{2}(b_3 \div b_2)\,dx \ \text{ in place of } \ x.$$

Neglecting $d^3y$ in the numerator and $d^2y$ in the denominator, we
get

$$\frac{dy}{y} = \frac{4\kappa b_2\,dx - (b_3 \div b_2)^2\,dx - 2\,(b_3 \div b_2)\,x}{(b_3 \div b_2)^2\,\left[x + \tfrac{1}{2}(b_3 \div b_2)\,dx\right]}.$$


Since  the  denominator  y in  the  first  member  is  supposed  to  be 
infinitely  greater  than  the  numerator  dy,  the  denominator  in  the 
second  member  must  be  infinitely  greater  than  its  numerator, 
so  that  in  the  denominator  we  may  neglect  dx  in  comparison 
with  x.  Further  let  the  constants  be  expressed  by  means  of  the 
two  new  constants 


$$a = \frac{2 b_2 (dx)^2}{b_3 (dx)^3}, \qquad b = \kappa b_2 (dx)^2. \tag{13}$$

Since $\kappa$ is supposed to be an infinity of the second order, $b$
represents a finite area. The equation will now stand


$$\frac{dy}{y} = \frac{dx}{x}\,(a^2 b - 1) - a\,dx, \tag{14}$$

and integration gives

$$\log y = (a^2 b - 1)\log x - ax + \log C,$$

$$y = C\,x^{a^2 b - 1}\,e^{-ax}. \tag{15}$$
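That (15) does satisfy (14) can be confirmed numerically by comparing a finite-difference derivative of $\log y$ with the second member of (14) divided by $dx$. The sketch below is illustrative only; the function names and the sample values of $a$ and $b$ are assumptions chosen for the check.

```python
import math

def log_curve(x, a, b, log_c=0.0):
    """log y for formula (15): y = C x^(a^2 b - 1) e^(-a x)."""
    return log_c + (a * a * b - 1.0) * math.log(x) - a * x

def ode_residual(x, a, b, h=1e-5):
    """d(log y)/dx by central difference, minus (a^2 b - 1)/x - a from (14)."""
    lhs = (log_curve(x + h, a, b) - log_curve(x - h, a, b)) / (2.0 * h)
    rhs = (a * a * b - 1.0) / x - a
    return lhs - rhs

# sample constants (assumed values): a = 2.5, b = 9
print(abs(ode_residual(3.0, 2.5, 9.0)) < 1e-6)
```

The constant $C$ drops out of the residual, as it must, since (14) determines $y$ only up to a constant factor.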


It now remains to determine the constant $C$ in (15). Since
$\Sigma(\lambda) = 1$ in the given polynomial and $\Sigma(l) = 1$ in its expansion, we
shall have $\Sigma(y) = 1$ in the formula (15). The $y$ which DeForest
uses represents an elementary area, so that it should be
understood to mean $y\,dx$ in modern notation. Thus equation (18),
omitting $dx$, gives the equation of the curve. Thus we have in
DeForest’s notation:

$$\frac{1}{dx}\int_0^{\infty} y\,dx = \frac{C}{a^{a^2 b}\,dx}\int_0^{\infty} (ax)^{a^2 b - 1}\,e^{-ax}\,d(ax) = 1,$$

which gives at once the value of $C$, and we have

$$y = \frac{a\,dx}{\Gamma(a^2 b)}\,(ax)^{a^2 b - 1}\,e^{-ax}, \tag{16}$$

the complete equation of the curve sought.
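As a check on (16), the elementary areas $y$ can be summed over a fine grid. The sketch below is a numerical illustration under assumed values ($a = 2.5$, $b = 9$, grid step $dx = 0.01$), not part of the original analysis. It confirms that the sum of $y$ is unity and that, with the origin of (16), the center of gravity lies at $x = ab$ and the second moment about it is $b$.

```python
import math

def deforest_area(x, a, b, dx):
    """Elementary area y of formula (16); x is measured from the start of the curve."""
    s = a * a * b                      # the exponent a^2 b
    log_y = (math.log(a * dx) - math.lgamma(s)
             + (s - 1.0) * math.log(a * x) - a * x)
    return math.exp(log_y)

def summary(a, b, dx=0.01, x_max=80.0):
    """Sum of y, its mean abscissa and its variance over a grid of step dx."""
    n = round(x_max / dx)
    xs = [(i + 0.5) * dx for i in range(n)]   # midpoints of the elementary strips
    ys = [deforest_area(x, a, b, dx) for x in xs]
    total = sum(ys)
    mean = sum(x * y for x, y in zip(xs, ys)) / total
    var = sum((x - mean) ** 2 * y for x, y in zip(xs, ys)) / total
    return total, mean, var

total, mean, var = summary(2.5, 9.0)
print(total, mean, var)  # close to 1, a*b = 22.5 and b = 9
```

The logarithm of the Gamma function (`math.lgamma`) is used instead of the Gamma function itself so that large values of $a^2 b$ do not overflow.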

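If we now transfer the origin of coordinates to the center of
gravity by reversing the substitution (11), that is, by putting
$x + \dfrac{2\kappa b_2^2\,dx}{b_3}$, or $x + ab$, in place of $x$
in (16), we have

$$y = \frac{dx}{ab\,\Gamma(a^2 b)}\left(\frac{a^2 b}{e}\right)^{a^2 b}\left(1 + \frac{x}{ab}\right)^{a^2 b - 1} e^{-ax}. \tag{17}$$

Applying a known formula for $\Gamma(n)$,

$$\Gamma(n) = \sqrt{2\pi}\;n^{\,n - \frac{1}{2}}\,e^{-n}\left(1 + \frac{1}{12n} + \frac{1}{288n^2} + \text{etc.}\right),$$

(17) is reduced to

$$y = \frac{dx}{k\sqrt{2\pi b}}\left(1 + \frac{x}{ab}\right)^{a^2 b - 1} e^{-ax}, \tag{18}$$

where

$$k = 1 + \frac{1}{12a^2 b} + \frac{1}{288(a^2 b)^2} + \text{etc.}$$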
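The passage from (17) to (18) replaces an exact coefficient by one that contains the truncated series $k$. How little is lost can be judged numerically. The following sketch compares the two coefficients (with $dx$ omitted) for assumed sample constants; the function names are illustrative.

```python
import math

def exact_factor(a, b):
    """Coefficient of (17) without dx: (a^2 b / e)^(a^2 b) / (a b Gamma(a^2 b))."""
    s = a * a * b
    return math.exp(s * (math.log(s) - 1.0) - math.log(a * b) - math.lgamma(s))

def stirling_factor(a, b, terms=3):
    """Coefficient of (18) without dx: 1 / (k sqrt(2 pi b)), with k cut to `terms` terms."""
    s = a * a * b
    coeffs = [1.0, 1.0 / 12.0, 1.0 / 288.0][:terms]
    k = sum(c / s ** n for n, c in enumerate(coeffs))
    return 1.0 / (k * math.sqrt(2.0 * math.pi * b))

a, b = 2.5, 9.0                       # assumed sample constants, a^2 b = 56.25
print(exact_factor(a, b), stirling_factor(a, b), stirling_factor(a, b, terms=1))
```

With these values the three-term $k$ agrees with the exact coefficient to better than one part in a million, while taking $k = 1$ leaves an error of somewhat more than one part in a thousand.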


Returning to the meaning of the constants, $a$ in (13) may be
written

$$a = 2\left(\frac{b_2 (dx)^2}{b_3 (dx)^3}\right) = 2\left(\frac{\kappa b_2 (dx)^2}{\kappa b_3 (dx)^3}\right). \tag{19}$$

This shows that the part within the parenthesis may be regarded
as the square of the quadratic radius divided by the cube of the
cubic radius, either in the first power of the polynomial or in
its expansion to the $\kappa$ power.

The values of $a$ and $b$ may thus be expressed by means of the
coefficients $\lambda$ in the given polynomial, or by means of the
ordinates $y$ to the limiting curve. When the $\lambda$’s and $y$’s are all
positive, $\kappa b_2 (dx)^2$ is the square of the quadratic mean error “$\varepsilon$” and
$\kappa b_3 (dx)^3$ is the cube of what DeForest calls the cubic mean
inequality “$f$.”

The constants in (13) will then be

$$a = 2\varepsilon^2 \div f^3, \qquad b = \varepsilon^2.$$

It will be seen then that the constants $\varepsilon^2$ and $f^3$ are respectively
the second and third moments of Pearson and therefore can be
advantageously determined by his method. The above sketch
should enable the reader to get an idea of the method of DeForest’s
analysis, and this was my object in presenting it. The properties
of the formula as well as the method of transformation of
the present formula to the normal probability form are adequately
treated in the original paper of DeForest. However, regarding
these points, the reader will get still better information from
Pearson’s discussion on his curve of Type III.

Although I have not given the process of transformation of
the formula to the normal form, DeForest’s statement in this
connection will be worth noting. He states that he would have
obtained the normal form directly from the equation (9) if he
had neglected $d^2y$. If instead of retaining only $dy$ and $d^2y$ he
should also retain $d^3y$, the resulting equation, provided such is
integrable, would doubtless give a limiting curve of a still more
general form, of which the curve derived from (18) is but a
particular case. Thus he thought that the probability curve and
his curve (18) are only the first and second approximations to
the actual form of an expansion to a high power.
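The remark about first and second approximations can be illustrated on a concrete expansion. The sketch below takes the polynomial $(1-p) + pZ$ with $p = 0.1$ raised to the power $\kappa = 100$ (both values are assumptions chosen for illustration), computes the second and third moments of the coefficients about their center of gravity, and compares the coefficients with the normal curve and with formula (17), taking $dx = 1$.

```python
import math

def expansion_coefficients(p, kappa):
    """Coefficients l_0 .. l_kappa of ((1 - p) + p Z)^kappa."""
    return [math.comb(kappa, r) * p ** r * (1.0 - p) ** (kappa - r)
            for r in range(kappa + 1)]

def central_moments(l):
    """Center of gravity, second and third moments of the coefficients."""
    mean = sum(r * v for r, v in enumerate(l))
    mu2 = sum((r - mean) ** 2 * v for r, v in enumerate(l))
    mu3 = sum((r - mean) ** 3 * v for r, v in enumerate(l))
    return mean, mu2, mu3

def normal_y(x, mu2):
    """Ordinate of the normal curve with second moment mu2."""
    return math.exp(-x * x / (2.0 * mu2)) / math.sqrt(2.0 * math.pi * mu2)

def deforest_centered(x, a, b):
    """Formula (17) with dx = 1; x is measured from the center of gravity."""
    s = a * a * b
    t = 1.0 + x / (a * b)
    if t <= 0.0:
        return 0.0                     # the curve is limited on this side
    log_y = (s * (math.log(s) - 1.0) - math.log(a * b) - math.lgamma(s)
             + (s - 1.0) * math.log(t) - a * x)
    return math.exp(log_y)

def max_errors(p=0.1, kappa=100):
    """Largest absolute deviation of each curve from the coefficients."""
    l = expansion_coefficients(p, kappa)
    mean, mu2, mu3 = central_moments(l)
    a, b = 2.0 * mu2 / mu3, mu2        # constants of (13) with dx = 1
    err_normal = max(abs(v - normal_y(r - mean, mu2)) for r, v in enumerate(l))
    err_deforest = max(abs(v - deforest_centered(r - mean, a, b))
                       for r, v in enumerate(l))
    return err_normal, err_deforest

print(max_errors())
```

For this example the largest deviation of the unsymmetrical curve from the coefficients is several times smaller than that of the normal curve, which agrees with DeForest's view of the two curves as successive approximations.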

From the foregoing discussion the reader will notice a close
similarity between DeForest’s formula and Pearson’s formula
for the curve of Type III. For convenience, I shall enumerate
some of the properties common to these two curves.

(1) Both are skew binomial curves.

(2) Each curve is limited on one side of the mean.

(3) The analytical constants are determined from the first
three moments.

(4) Both can be reduced to the normal form.

(5) Each is a particular case of a more general formula.

It will be demonstrated in the following pages that, although
these two formulas show no further apparent similarity, they
are in fact identical.


From the differential equation

$$\frac{1}{y}\frac{dy}{dx} = \frac{-x}{\beta_1 + \beta_2 x}$$


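Professor Pearson obtained his formula for the curve of Type III,
which is usually written in the following form:

$$y = \frac{\alpha\,p^{\,p+1}}{a\,e^{p}\,\Gamma(p+1)}\left(1 + \frac{x}{a}\right)^{\gamma a} e^{-\gamma x}. \tag{20}$$

The following relations are also given:

$$\gamma = \frac{2\mu_2}{\mu_3}; \qquad p = \frac{4\mu_2^3}{\mu_3^2} - 1; \qquad a = \frac{2\mu_2^2}{\mu_3} - \frac{\mu_3}{2\mu_2}.$$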


Since the distance of the centroid vertical from the axis of $y$, or
maximum ordinate, is $\mu_3 \div 2\mu_2$, by changing the value of $x$, that is,
putting

$$x = x + \frac{\mu_3}{2\mu_2},$$


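(20) is reduced into the following, remembering that $\gamma a = p$:

$$y = \frac{\alpha\,p^{\,p+1}}{a\,e^{p}\,\Gamma(p+1)}\left(1 + \frac{x + \dfrac{\mu_3}{2\mu_2}}{\dfrac{2\mu_2^2}{\mu_3} - \dfrac{\mu_3}{2\mu_2}}\right)^{\frac{4\mu_2^3}{\mu_3^2} - 1} e^{-\frac{2\mu_2}{\mu_3}\left(x + \frac{\mu_3}{2\mu_2}\right)}$$

$$= \frac{\alpha\,p^{\,p+1}}{a\,e^{p+1}\,\Gamma(p+1)}\left(\frac{p+1}{p}\right)^{p}\left(1 + \frac{x}{2\mu_2^2 \div \mu_3}\right)^{p} e^{-\frac{2\mu_2}{\mu_3}x}$$

$$= \frac{\alpha}{\sqrt{2\pi\mu_2}}\cdot\frac{\sqrt{2\pi(p+1)}\;p^{\,p}\,e^{-(p+1)}}{\Gamma(p+1)\left(\dfrac{p}{p+1}\right)^{p}}\left(1 + \frac{x}{2\mu_2^2 \div \mu_3}\right)^{p} e^{-\frac{2\mu_2}{\mu_3}x},$$

since $p \div a = \gamma = \sqrt{(p+1) \div \mu_2}$; and finally, as the result of
transferring the origin to the centroid vertical, we obtain

$$y = \frac{\alpha}{\sqrt{2\pi\mu_2}}\cdot\frac{\sqrt{2\pi(p+1)}\;e^{-(p+1)}\,(p+1)^{p}}{\Gamma(p+1)}\left(1 + \frac{x}{2\mu_2^2 \div \mu_3}\right)^{p} e^{-\frac{2\mu_2}{\mu_3}x}. \tag{21}$$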


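If we now apply to the above (21) DeForest’s notation, that is,

$$\mu_2 = b \quad \text{and} \quad 2\mu_2 \div \mu_3 = a,$$

so that $p + 1 = a^2 b$, we obtain at once

$$y = y_1\left(1 + \frac{x}{ab}\right)^{a^2 b - 1} e^{-ax},$$

where

$$y_1 = \frac{\alpha}{\sqrt{2\pi\mu_2}}\cdot\frac{\sqrt{2\pi(p+1)}\;e^{-(p+1)}\,(p+1)^{p}}{\Gamma(p+1)}.$$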


It only remains to see whether or not $y_1$ in Pearson’s formula is
identical with DeForest’s $C$.

We have

$$y_1 = \frac{\alpha}{\sqrt{2\pi b}}\cdot\frac{\sqrt{2\pi(a^2 b)}\;e^{-a^2 b}\,(a^2 b)^{a^2 b - 1}}{\Gamma(a^2 b)} = \frac{\alpha\,a\,e^{-a^2 b}\,(a^2 b)^{a^2 b - 1}}{\Gamma(a^2 b)}.$$

Using the approximation formula for $\Gamma(n)$ which DeForest uses
in (18), we have

$$y_1 = \frac{\alpha\,a}{k\sqrt{2\pi}}\cdot\frac{1}{\sqrt{a^2 b}} = \frac{\alpha}{k\sqrt{2\pi b}},$$

where

$$k = 1 + \frac{1}{12a^2 b} + \frac{1}{288(a^2 b)^2} + \text{etc.}$$


Since $\alpha$ is unity in DeForest’s formula, Pearson’s formula
for the curve of Type III immediately reduces to DeForest’s.
That is,

$$y = \frac{1}{k\sqrt{2\pi b}}\left(1 + \frac{x}{ab}\right)^{a^2 b - 1} e^{-ax}.$$
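The identity of the two formulas can also be confirmed numerically. The sketch below evaluates Pearson's formula (20), with $x$ measured from the centroid vertical, and DeForest's formula (17) with $dx$ omitted, for the same second and third moments. The function names and the sample moments ($\mu_2 = 9$, $\mu_3 = 7.2$) are assumptions for illustration, and $\alpha$ is taken as unity.

```python
import math

def pearson_type3(x, mu2, mu3, alpha=1.0):
    """Formula (20), with x measured from the centroid vertical."""
    gamma = 2.0 * mu2 / mu3
    p = 4.0 * mu2 ** 3 / mu3 ** 2 - 1.0
    a = p / gamma                       # equals 2 mu2^2/mu3 - mu3/(2 mu2)
    x_mode = x + mu3 / (2.0 * mu2)      # abscissa measured from the maximum ordinate
    log_y0 = (math.log(alpha / a) + (p + 1.0) * math.log(p)
              - p - math.lgamma(p + 1.0))
    return math.exp(log_y0 + p * math.log(1.0 + x_mode / a) - gamma * x_mode)

def deforest(x, a, b):
    """Formula (17) with dx omitted; x is measured from the center of gravity."""
    s = a * a * b
    log_y = (s * (math.log(s) - 1.0) - math.log(a * b) - math.lgamma(s)
             + (s - 1.0) * math.log(1.0 + x / (a * b)) - a * x)
    return math.exp(log_y)

mu2, mu3 = 9.0, 7.2                     # assumed sample moments
a, b = 2.0 * mu2 / mu3, mu2             # DeForest's constants
for x in (-5.0, 0.0, 5.0):
    print(x, pearson_type3(x, mu2, mu3), deforest(x, a, b))
```

The comparison uses the exact form (17) rather than (18), so the agreement does not depend on where the series for $k$ is cut off.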


Thus  DeForest’s  formula  presents  several  interesting  points 
which  I herewith  enumerate  as  the  conclusion  of  the  present 
report. 

(1)  DeForest’s  investigation  gives  an  additional  proof  for 
the  theoretical  basis  of  Pearson’s  generalized  probability  curve. 

(2)  DeForest’s  investigation  is  interesting  from  an  historical 
standpoint  since  he  actually  obtained  one  of  Pearson’s  curves 
many  years  ago,  and  his  work  suggests  a more  generalized  curve. 

(3) Since DeForest’s formula (see (18), p. 286) retains an
elementary character, the curve fitting can be accomplished with
comparatively small labor, and it can advantageously be used in
place of the formula of Pearson for the curve of Type III.